Add optional "Show Cost (USD)" toggle to the leaderboard #42
Open
lakshvantb wants to merge 8 commits into
Surfaces per-task / per-category / per-model cost beside scores, mirroring how scores are displayed. Cost is a muted second line inside each existing metric cell; OFF by default, so the table is unchanged unless toggled on.

Data is a parallel, feature-detected public/cost_<date>.csv with the SAME header shape as table_<date>.csv but USD totals per task. Cost rolls up by SUM (totals, not means): subtask -> category -> model. Models present in the cost file show totals; every other model shows a single "n/a" (in the Global Average cell, or the first metric cell when one category is selected). Dates without a cost file hide the toggle entirely and render exactly as before.

- Averaging.js: add sumColumns() (sum twin of calculateAverage)
- CSVTable.jsx: showCost state + `cost` URL param, feature-detected cost fetch, Show Cost checkbox (only when a cost file is loaded), per-cell cost render, reset in Clear Filters
- App.css: .cost styling (muted, tabular-nums; hidden < 600px)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
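The SUM rollup described in this commit (subtask -> category -> model) can be sketched as follows. This is a minimal illustration, not the code in Averaging.js: the `sumColumns(row, columns)` signature, the `null` return for rows with no numeric values, and the category and task names are all assumptions made for the example.

```javascript
// Hypothetical sketch: sum the given columns of one parsed cost-CSV row.
// Returns null when no column holds a number, so callers can tell
// "no cost data" apart from a genuine total of 0.
function sumColumns(row, columns) {
  let total = 0;
  let seen = false;
  for (const col of columns) {
    const value = Number.parseFloat(row[col]);
    if (Number.isFinite(value)) {
      total += value;
      seen = true;
    }
  }
  return seen ? total : null;
}

// Assumed shape: category name -> subtask column names (illustrative only).
const categories = {
  "Category A": ["task_a", "task_b"],
  "Category B": ["task_c"],
};

// One made-up cost row, keyed like a row of cost_<date>.csv.
const costRow = { model: "example-model", task_a: "1.25", task_b: "2.50", task_c: "0.75" };

// Category level: sum of that category's subtasks.
const categoryTotal = sumColumns(costRow, categories["Category A"]);

// Model level: sum over every subtask column, which equals the sum of the category totals.
const modelTotal = sumColumns(costRow, Object.values(categories).flat());

console.log(categoryTotal, modelTotal);
```

Because totals are sums and not means, the model-level figure is the sum of the category figures, which is the reconciliation property the Testing notes check.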
6cec09a to dd72fdb
TEST DATA for exercising the Show Cost (USD) feature — real per-task USD totals computed from verified input+output tokens x cost_per_million (or stored cost_usd) for 18 board models with correct token tracking. Normal tasks only, so the three Agentic Coding columns (javascript/python/typescript) render "—" for covered models. Models without pricing or not on the board show "n/a". Replace with official rerun output before any production use. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
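As a rough illustration of the per-task arithmetic named in this commit (input+output tokens x cost_per_million, or a stored cost_usd), here is a small sketch. The record shape, the field names other than `cost_per_million` and `cost_usd`, and the precedence between the two sources are assumptions; the script that actually produced the test data is not shown in this PR.

```javascript
// Hypothetical sketch: USD cost of one task run.
// Assumed precedence: price the tokens when pricing is known,
// otherwise fall back to a stored cost_usd, otherwise report null.
function taskCostUsd(record) {
  const hasPricing =
    Number.isFinite(record.cost_per_million) &&
    Number.isFinite(record.input_tokens) &&
    Number.isFinite(record.output_tokens);
  if (hasPricing) {
    const tokens = record.input_tokens + record.output_tokens;
    return (tokens / 1_000_000) * record.cost_per_million;
  }
  if (Number.isFinite(record.cost_usd)) {
    return record.cost_usd;
  }
  return null; // no pricing and no stored cost: the model would show "n/a"
}

// Made-up numbers: 2,000,000 tokens at 3 USD per million tokens.
const priced = taskCostUsd({ input_tokens: 1_500_000, output_tokens: 500_000, cost_per_million: 3 });
const stored = taskCostUsd({ cost_usd: 1.5 });
const unknown = taskCostUsd({});

console.log(priced, stored, unknown);
```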
…le-count) Addresses verification findings: csv.writer was emitting CRLF (table uses LF), and the question_id->task join cross-labeled consecutive_events vs integrals_with_game. Now assigns task by directory and keeps the latest-run version per (model,task). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
.other-controls now uses flex-wrap; previously the row overflowed horizontally and the rightmost toggles (Show High Unseen Bias, the new Show Cost) were clipped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
.other-controls is now centered (justify-content: center) so the wrapped toggle line is balanced; Clear Filters moved into a centered .clear-filters-row below the toggles instead of sitting inline. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
.table-container is a column flex with align-items: flex-start, so child rows shrank to content width and pinned left. align-self: stretch makes the rows full width so justify-content: center actually centers Clear Filters (and the toggles).
- Clear cost state when the date changes so a model never briefly shows the previous date's cost while the new cost file loads.
- Replace the dense per-cell cost ternary with a small costCell(columns, isAnchor) helper; behavior identical, far more readable.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
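A helper with the costCell(columns, isAnchor) shape mentioned in this commit might look like the sketch below. Only the name and the two parameters come from the commit message; the `makeCostCell` factory, the `formatUsd` helper, and the return values (a string, or `null` for "render nothing") are assumptions chosen to match the behavior described elsewhere in this PR.

```javascript
// Hypothetical formatter for the muted second line in a metric cell.
function formatUsd(value) {
  return `$${value.toFixed(2)}`;
}

// Hypothetical factory: binds one model's cost row (undefined when the
// model is absent from the cost file) and returns the per-cell helper.
function makeCostCell(costRow) {
  return function costCell(columns, isAnchor) {
    // Uncovered model: a single "n/a" in the anchor cell, nothing elsewhere.
    if (!costRow) {
      return isAnchor ? "n/a" : null;
    }
    let total = 0;
    let seen = false;
    for (const col of columns) {
      const value = Number.parseFloat(costRow[col]);
      if (Number.isFinite(value)) {
        total += value;
        seen = true;
      }
    }
    // Covered model with no data in these columns: dash placeholder (U+2014).
    return seen ? formatUsd(total) : "\u2014";
  };
}

const covered = makeCostCell({ model: "example-model", task_a: "1.25", task_b: "2.50" });
const uncovered = makeCostCell(undefined);

console.log(covered(["task_a", "task_b"], false));
console.log(covered(["task_c"], false));
console.log(uncovered(["task_a"], true));
```

Keeping the "covered or not" branch in one place is what lets each cell's render call stay a one-liner in place of a nested ternary.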
What
Adds a Show Cost (USD) toggle to the leaderboard. When on, each score cell gains a muted second line with cost, mirroring how scores are shown at every level (model total / category / subtask). Off by default — the table is byte-for-byte unchanged unless toggled.

How it works
- Data is a parallel public/cost_<date>.csv with the same header shape as table_<date>.csv (first col model, then the task columns) but holding USD totals per task.
- Models present in the cost file show totals; every other model shows a single n/a, in the Global Average cell or the first metric cell when only one category is selected. No allowlist or "top-N" logic in the UI.
- Dates without cost_<date>.csv hide the toggle and render exactly as today (fetch is feature-detected; guards against SPA 200-with-index.html fallbacks).
- cost=true URL param, consistent with the other display toggles; reset by Clear Filters.

Files
- src/Table/Averaging.js: sumColumns() (sum twin of calculateAverage)
- src/Table/CSVTable.jsx: state + URL param, cost fetch, checkbox (only when a cost file is loaded), per-cell render, reset
- src/App.css: .cost styling (muted, tabular-nums; hidden < 600px)

Activation
Drop a cost_<date>.csv into public/ (same model keys as table_<date>.csv, USD totals per task; key by the displayed model name so variant-collapse matches). Only those models show cost; the rest show n/a. Initial rollout will populate the top models from a rerun; the file is sparse by construction.

Testing
- npm run build compiles cleanly (only pre-existing lint warnings).
- categories_2026_01_08.json: category totals and the global total reconcile (e.g. Agentic Coding $38.00, python subtask $14.00, global $45.21); uncovered models render a single n/a.

🤖 Generated with Claude Code
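The feature-detected fetch from "How it works" has to cope with single-page-app hosts that answer a missing file with HTTP 200 and index.html. One way to guard against that is sketched below. It is not the code in CSVTable.jsx: `looksLikeCostCsv`, `loadCostCsv`, the `/cost_<date>.csv` URL path, and the specific checks (Content-Type, a leading `<`, a first header cell of `model`) are assumptions for illustration.

```javascript
// Hypothetical check: does this body look like the cost CSV and not an HTML fallback?
function looksLikeCostCsv(text, contentType = "") {
  if (contentType.toLowerCase().includes("text/html")) return false;
  const firstLine = text.trimStart().split(/\r?\n/, 1)[0];
  if (firstLine.startsWith("<")) return false; // e.g. "<!doctype html>"
  // The PR states the first column of the header is "model".
  return firstLine.split(",")[0].trim().toLowerCase() === "model";
}

// Hypothetical loader: resolves to the CSV text, or null when no usable
// cost file exists for the date (the toggle would then stay hidden).
// fetchImpl is injectable so the function can be exercised without a network.
async function loadCostCsv(date, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(`/cost_${date}.csv`);
    if (!response.ok) return null;
    const contentType = response.headers.get("content-type") || "";
    const text = await response.text();
    return looksLikeCostCsv(text, contentType) ? text : null;
  } catch {
    return null; // network error: behave as if there is no cost file
  }
}

console.log(looksLikeCostCsv("model,task_a,task_b\nexample-model,1.25,2.50\n"));
console.log(looksLikeCostCsv("<!doctype html><html><body></body></html>"));
```

Checking the body as well as the status code matters here because `response.ok` alone is true for the index.html fallback.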